EVoC Project


Implementing a Software Scripting Engine on 
Fermi architecture based NVIDIA GPUs to 
achieve safe memory reclocking. 


How did I get to this?


1. Web Developer 
2. KDE contributor 
3. No clue about X development


The path 


• Onsite internships
- Mozilla
- Google
- Apple
• GSoC deadline crossed


Two options 


1. Android App Development 
2. Nouveau 


The Project 


• Buying a new GPU - First NVIDIA card
• Fermi and Kepler
• Fermi memory reclocking


The problem with Fermi 


nv50 
- Laptops -> reclock memory and engines

- Save power 

- Default clock speed : Medium 


nva : 
- load based reclocking 
- Default clock speed : 1/3 to 1/2 
- Low performance on Nouveau 


FERMI 


- Default clock speed : 10%!! 
- Miserable performance 


Process of Reclocking 


• nv50 style
• Take the card off the bus
• Wait and write MMIO registers


The main issue 


• nv50 used HWSQ (HardWare SeQuencer)
• HWSQ removed on Fermi
• Replaced by PDAEMON


PDAEMON 


Full access to the registers 
Capable of IRQs 

Used for Hardware monitoring and 
Reclocking 

ISA: FuC (flexible microcode) 


Open-Source PDAEMON 


• Work done by Martin Peres ~mupuf
- Host -> PDAEMON Communication 
- Fan Management 
- Works on nva3 to nvd9 
- Should work on Kepler 


My Proposed Work 


1. PDAEMON -> Host Communication 
2. HWSQ replacement 
3. Documentation 


PDAEMON -> Host 


• Ring Buffer
- *GET / *PUT
- *PUT writes
- *GET reads
• Each process sends 4 params
1. Process Id
2. Message Id
3. Payload Size
4. Payload pointer


Basic checks 


Stop writing if buffer not read 

Stop reading if buffer empty 

Do not read if writing not complete 
Write if reading not complete 
Wrap around 
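
The checks above can be sketched in C as follows. This is only an illustration, not the merged PDAEMON code: the slot count, the field widths and the names `ring_msg`, `ring_put` and `ring_get` are assumptions made for the example. One slot is left unused so that a full buffer can be told apart from an empty one.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SLOTS 8u /* hypothetical capacity; one slot always stays free */

/* The four parameters each process sends (field widths are assumed). */
struct ring_msg {
    uint32_t process_id;
    uint32_t message_id;
    uint32_t payload_size;
    uint32_t payload_ptr; /* assumed: offset of the payload in PDAEMON data */
};

struct ring {
    struct ring_msg slot[RING_SLOTS];
    uint32_t get; /* index of the next slot the reader consumes */
    uint32_t put; /* index of the next slot the writer fills */
};

/* Writer side: refuse to write while the reader has not caught up. */
static bool ring_put(struct ring *r, struct ring_msg m)
{
    uint32_t next = (r->put + 1u) % RING_SLOTS; /* wrap around */

    if (next == r->get)
        return false;    /* buffer full: stop writing */
    r->slot[r->put] = m; /* fill the slot first ... */
    r->put = next;       /* ... then publish it, so a reader never
                            sees a half-written message */
    return true;
}

/* Reader side: refuse to read when nothing has been published. */
static bool ring_get(struct ring *r, struct ring_msg *out)
{
    if (r->get == r->put)
        return false;    /* buffer empty: stop reading */
    *out = r->slot[r->get];
    r->get = (r->get + 1u) % RING_SLOTS; /* wrap around */
    return true;
}
```

On real hardware the two indices live in memory shared between PDAEMON and the host, so the ordering of the slot write and the PUT update would also need the appropriate barriers; the sketch ignores that.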


Status 


• PDAEMON -> HOST
- TESTED 
- MERGED 


Fermi Scripting Engine (FSE) 


• HWSQ replacement
• Capable of memory reclocking


FSE Implementation Process 


1. Understanding HWSQ 
2. Designing the ISA 
3. Implementing it in FuC 


FSE Design 


1. Full range Delay
2. Short range Delay
3. MMIO write
4. MMIO mask
5. MMIO wait
6. PDAEMON -> HOST message


Delay Implementation 


• Short range:
- 16-bit nanoseconds
- 16-bit microseconds
• Full range:
- 64-bit nanoseconds

• Write
- 8-bit and 32-bit
• Mask
• Wait
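
The slide does not spell out the semantics of these three operations, so the C sketch below assumes the conventional ones: mask is a read-modify-write that changes only the bits selected by the mask, and wait polls until the masked register equals an expected value or a poll budget runs out. The register file is a plain array standing in for real MMIO, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_REGS 16u
static uint32_t fake_mmio[NUM_REGS]; /* stand-in for the real MMIO space */

static uint32_t mmio_rd32(uint32_t idx)
{
    return fake_mmio[idx % NUM_REGS];
}

/* Write: store a full 32-bit value. */
static void mmio_wr32(uint32_t idx, uint32_t value)
{
    fake_mmio[idx % NUM_REGS] = value;
}

/* Mask (assumed semantics): update only the bits set in `mask`,
 * leave the rest untouched, and return the previous value. */
static uint32_t mmio_mask(uint32_t idx, uint32_t mask, uint32_t value)
{
    uint32_t old = mmio_rd32(idx);

    mmio_wr32(idx, (old & ~mask) | (value & mask));
    return old;
}

/* Wait (assumed semantics): poll until (reg & mask) == value,
 * giving up after max_polls reads. */
static bool mmio_wait(uint32_t idx, uint32_t mask, uint32_t value,
                      uint32_t max_polls)
{
    for (uint32_t i = 0; i < max_polls; i++)
        if ((mmio_rd32(idx) & mask) == value)
            return true;
    return false;
}
```

A bounded wait matters in a reclocking script: if the hardware never reaches the expected state, the script has to be able to give up instead of hanging with the card off the bus.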


Send msg 


• Hooks up with PDAEMON->Host
• Takes two params
- SIZE
- MESSAGE


Unexpected Hurdle 


• Planned demo for XDC
• Unaligned memory access
• Implemented ld 32, ld 16 and ld 08
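
One common way to read multi-byte values from addresses that are not naturally aligned is to assemble them one byte at a time. The C sketch below shows that idea only; it is not the FuC implementation from the project. The function names mirror the slide, and little-endian byte order is an assumption.

```c
#include <stdint.h>

/* Byte-wise loads that work at any address (little-endian assumed). */

static uint8_t ld_08(const uint8_t *p)
{
    return p[0];
}

static uint16_t ld_16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t ld_32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}
```

Script operands packed back to back (for example an 8-bit opcode followed by a 32-bit register address) end up at odd offsets, which is where helpers of this kind are needed.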


Current Status 


• Most of it tested and working
• Send msg needs to support "msg id"
• Send msg needs to pass testing


Documentation 


1. Blogpost introducing Nouveau basics
2. Complete EVoC documentation on blog
3. Intro.txt by mwk in envytools
4. More Documentation for Newbies!


// Beginner's Guide to KDE Development 


Wrap Up 


1. PDAEMON -> HOST :success 
2. FSE: send msg testing left 
3. Documentation - Intro.txt & blogpost 


EVoC 


Endless Vacation of Code 


Propose a 13-week (3-month) project
$5000
- $1000 upfront
- $2000 mid-term
- $2000 completion


Can start anytime 


EVoC suggestions 


Flexibility == Good 

Need more specific rules != Refer GSoC
Selection completely on Mentor
Prerequisites on Wiki

Open Mentors listed on Wiki 


Thoughts on proposition by Martin 


Patch requirement compulsory? 

Limit a student to 2 EVoCs? NO?! 

Limit a student to 1 EVoC/year? Yes.
Upfront payment low? Yes. 

3 Month engagement before project? No! 


1. Something for Mentors? 


2. PUBLICIZE! 


Links 


1. https://gitorious.org/pdaemon 
2. supreetpal.blogspot.com 
3. IRC nick: supreet 


4. Email : supreetpal@gmail.com 


